Concurrent and fail-safe replicated simulations on heterogeneous networks: An introduction to EcliPSe
Abstract
This paper presents an overview of the ACES parallel software system and, in particular, an introduction to the EcliPSe layer of the system. The ACES system is a fault-tolerant, layered software system for heterogeneous-network-based cluster computing. The EcliPSe toolkit, which resides on an upper layer, was constructed specifically for replication-based and domain-decomposition-based simulation applications. It is not, however, restricted to simulations and supports any message-passing form of parallel processing. By taking advantage of networks of heterogeneous machines, generally "idle" workstations, EcliPSe programs can achieve supercomputer-level performance with little programming effort. This was a motivating factor in EcliPSe's design. We present an overview of key application-level features in EcliPSe, a new user interface, support for fault-tolerant simulation, and performance results for three simple but large-scale and representative experiments.
Similar resources
On the Effectiveness of Superconcurrent Computations on Heterogeneous Networks
Concurrent computing on networked collections of computer systems is rapidly evolving into a viable technology that is attractive from the economic, performance, and availability perspectives. Several software infrastructures that support such heterogeneous network-based concurrent computing have evolved, and are in use for production-quality high-performance computing. In this paper, we descri...
Third-order Decentralized Safe Consensus Protocol for Inter-connected Heterogeneous Vehicular Platoons
In this paper, the stability analysis and control design of heterogeneous traffic flow is considered. It is assumed that the traffic flow consists of infinite number of cooperative non-identical vehicular platoons. Two different networks are investigated in stability analysis of heterogeneous traffic flow: 1) inter-platoon network which deals with the communication topology of lead vehicles and...
Fail-safe concurrency in the EcliPSe system
Local or wide-area heterogeneous workstation clusters are relatively cheap and highly effective, though inherently unstable operating environments for long-running distributed computations. We found this to be the case in early experiments with a prototype of the EcliPSe system, a software toolkit for replicative applications on heterogeneous workstation clusters. Hardware or network failures i...
Energy-Aware Probabilistic Epidemic Forwarding Method in Heterogeneous Delay Tolerant Networks
Due to the increasing use of wireless communications, infrastructure-less networks such as Delay Tolerant Networks (DTNs) should be highly considered. DTN is most suitable where there is an intermittent connection between communicating nodes such as wireless mobile ad hoc network nodes. In general, a message sending node in DTN copies the message and transmits it to nodes which it encounters. A...
Available fail-safe systems
Continuity of service and cost-effectiveness are adding new challenges to life critical systems over and above the underlying safety concerns. The introduction of redundant components is a necessary condition for increasing the overall system availability with respect to physical component failures. Here we consider redundancy by means of replicating fail-safe components in a distributed real-t...
Journal: Simul. Pr. Theory
Volume: 3
Publication date: 1995